ABSTRACT



Work towards the completion of a pilot system for machine 
translation of German scientific and technical literature into 
English is described. This report describes efforts performed 
in the area of grammar formalism, programming, and linguistics 
during the period from February 5, 1973 through July 5, 1973; 
it supplements the work performed under contract F30602-70-C-0118
[1,2,3].

Work on grammar formalism concentrated mainly on increasing 
the power of the subscript grammar to permit the prevention of 
intermediate "forced" readings. Work in system construction con- 
centrated on the completion of the grammar maintenance programs 
and on the core of the systems programs used by all analysis and 
production algorithms. The linguistic work concentrated on the 
coverage of the German surface syntax, the "choice rules" for 
the generation of the corresponding standard structures (deep 
structures), and their grammatical description. 
INTRODUCTION



Any attempt to carry out fully automatic quality translation 
of random texts has to take into account the various linguistic 
problems confronting mechanical analysis of natural language and 
must provide solutions for them, without any pre- or post-editing. 

The typical problems arising in the analysis and translation 
of German text into English are 

a. discontinuous constructions, in particular verb prefix
combinations,

b. idiomatic expressions, 

c. empty words, which are not translated, 

d. deleted terms with or without meaning change, 

e. lexical collocations with internal variable slots, 
dependent on or independent of environment, 

f. category changes in translation, 

g. sentence pattern changes in translation, 

h. translation of the definite article in cases of inalien-
able property,

i. ambiguity resolution based on information within a
sentence.

In addition, the following problems which pertain to sequences 
of sentences have to be accounted for and solved: 

j. anaphoric relations, in particular pronominal reference,

k. ambiguity resolution based on information in co-text. 

We finally expect a quality mechanical translation system to 

l. preserve input non-ambiguity.

The following German sentences provide examples of the 
difficulties which are encountered and need to be solved. In 
each of the German examples, the word or word combinations and 
their English correspondences representing the linguistic problem 
will be underlined; deleted terms are represented by (), inter-
nal variable slots by []. Some of the German sentences will first
be followed by a literal translation into English and then by the
correct translation.
a. Er fängt mit dem Experiment an.

He catches with the experiment at.
He begins with the experiment.

He let them notice say that

He sent word to them that

Die Lösung, an der sie ja lange gearbeitet hatten, wurde
schließlich gefunden.

The solution, which they had yes worked on for a long
time, was finally found.

The solution, which they had worked on for a long
time, was finally found.

d. Die Sonne geht im Osten auf und () im Westen unter.

The sun goes in the east up and in the west down.

The sun rises in the east and sets in the west.

Fritz ist nach Spanien () und seine Frau () nach Italien
gereist.

Fritz is to Spain () and his wife traveled to Italy.
Fritz traveled to Spain and his wife to Italy.

e. Die Entwicklung nahm [ihren] Anfang.

The development took [its] inception.
The development began.
Er traf [keine] Anstalten, zu ...
He met [no] institutions to ...
He made no preparations to ...

f. Sie experimentierten weiter.
They experimented further.

They continued to experiment.
g. Dieser Umstand kam ihnen zu Hilfe.

This fact came them to assistance.

This fact came to their assistance.

h. Er kreuzt die Arme.

He crosses the arms.

He crosses his arms.

Die Leitung zerbrach.

The line broke. (Not: The management broke.)

Die Leitung kontrollierte das Unternehmen.

The management controlled the venture. (Not: The line
controlled the venture.)

He has succeeded. (The experiment) 

Wir kontrollierten die Leitung. Sie war zerbrochen.

We checked the line. It was broken. (Not: We checked
the management. It was broken.)

Wir kontrollierten die Leitung. Sie war korrupt.

We checked the management. It was corrupt. (Not: We
checked the line. It was corrupt.)

Der Motor der Maschine ist zerbrochen. Wir werden sie
zurückschicken.

The motor of the engine is broken. We shall send it
back.

The motor of the engine is broken. We shall send the
engine back.

The solution of the linguistic problems exemplified by the 
sentences above requires a particular approach which has been 
an integral part of linguistic theory for almost two decades: 
the reduction of surface texts to intermediate, simpler struc-
tures, generally called kernel strings or deep structures. Past
experience has shown that quality mechanical translation cannot
be obtained by trying to map input surface strings directly into
output surface strings or by trying to derive the meaning of a
text directly from its surface form.

The Linguistics Research System (LRS) has been designed to 
overcome and solve such linguistic problems; LRS incorporates 
the components of current linguistic theory, such as surface 
component, deep component, transformational component, and se-
mantic component.

The work described in this report complements the description
of the efforts towards the attainment of Quality Mechanical Trans-
lation in the areas of theory, programming, and linguistics given
in earlier reports [1,2,3].
SECTION I



GRAMMAR FORMALISM 



During the five months' contract period a small number of 
changes pertaining to the rule format were made. Certain of these 
have already been described in our Final Report [4]; others were
added to increase the linguists' ability to reduce the number of 
intermediate analyses. The changes pertain to the word grammar, 
the syntactic grammar, and the choice rules. 



1.1 Word Grammar



1.1.1 The M-Operator (Marginal-Symbol Operator)

The M-Operator allows the linguist to establish potential
sentence boundaries. Thus, rule C 143

C 143    V WORD     V BLANK    V CONJ       V BLANK
         M 2,4      B          $ CJ(S)      B

associates a potential sentence boundary with the blank spaces
preceding and following a sentence-conjoining conjunction. The
expression M 2,4 is to be read as "insert potential sentence
boundary into the second and fourth rule term".

Clause rules are sensitive to such boundaries and only apply 
to text spans bounded by them. 



1.1.2 The I-Operator (Insert Operator)

By means of the insert operator, the linguist can add sub- 
scripts and values to a constituent. These can be referred to 
during syntactic analysis and can be used to restrict the number 
of possible intermediate interpretations or to avoid readings 
which would eventually not be well-formed. 



Example:

C 162    V WORD     V DET      V ADJ
         I 2PA      $ GD       $ GD
         I 3FD      $ CA       $ CA
                    $ NU       $ NU
                    $ IN       $ IN
         2.1,3.1
         2.2,3.2
         2.3,3.3
         2.4,3.4
The request I 2PA stands for "add the subscript PA ("precedes
adjective") to the second term"; I 3FD, for "add the subscript
FD ("follows determiner") to the third term".



1.1.3 The P-Operator (Preference Operator) 

The P-Operator allows the linguist to select one reading
from multiple interpretations of a particular text span. The
following two rules illustrate the occurrence of such multiple
interpretations:



C 129    V AJ              V VB
         + CL(P,C,S        $ FM(PAPL)
           T.EST)          $ FO(A)
         + FOX(A)          $ TO('VR)
         + TOX(HU,A        $ TS
           L.PL,IN,
           NT.AB)
         $ 2.1VB
         = 2

C 6      V A               V VB
         + CL(P,C,S        $ FM(PAPL)
           T.EST)          $ FO(*A)
         $ 2.2             $ TO
         $ 2.3             $ TS
         $ 2.4TM
         $ 2.1VB


Rule C 6 interprets adjectival past participles of intransitive
verbs as adjectives, as for example in der gefallene Schnee.
Rule C 129 interprets those of transitive verbs, e.g., das durch-
geführte Experiment.

Verbs which occur with an optional accusative object, as in
das gelesene Buch, will consequently receive two interpretations,
although only the latter interpretation is correct. After the
application of rules C 14 and C 28 only the correct interpretation
of the past participle is retained.



C 14     V A            V AJ
         + OX           $ FOX
         $.2.6TM        $ TOX
         $*2.5FO        $ FO
         $*2.6TO        $ TO
         A 2            - 2.1,2.3
                        - 2.2,2.4

C 28     V WORD     V LB     V ADJ     V RB
         P                   $ OX      B
                             6
1.2 Syntactic Grammar



1.2.1 Arguments of Operations 

The capabilities of the analysis algorithms were increased
to permit operations between subscripts of the same term. As
rules C 129 and C 14 above show, it is thus possible, for example,
to extract the semantic values associated with the accusative
object governed by a verb [FO(A)] and make them the semantic
features of the noun to be modified by the adjective [TM].



1.2.2 Negation of Conjunctions and Disjunctions of
Values

The capability of expressing "negate" and "ignore" for con-
junctions and disjunctions of values was added. Such expressions
have the format *:(value-connector-value[connector-value]):
(The terms in brackets indicate repetitions of connector-value
combinations.) If value combinations are to be ignored, the
asterisk is replaced by a minus sign.



1.3 Syntactic Choice

It is frequently possible to determine, during syntactic 
choice, that the span analyzed by a particular rule will also 
have been analyzed by another rule. Since syntactic choice makes 
use of semantic information, a decision between the two rules can 
often be made. If a rule could thus be rejected by means of the 
nth decision in a choice rule, it had been necessary, so far, to 
ask in the main rule whether the nth decision was true. If so, 
the rule was deleted; if not, it was retained. 

The formalism was extended to allow the linguist to state 
the rejection of the main rule directly in the choice rule. It 
is thus no longer necessary to ask in the main rule whether the 
nth decision was true. The rejection is effected by the state-
ment C (conditions), T/E, or F/E; the conditions in parentheses
are optional. The "return to the main rule" (cf. Final Report [3])
is now expressed by T/R or F/R.



SECTION II 



SYSTEMS 



Certain of the programs described below were begun under con-
tract F30602-70-C-0118; the others were written completely during
the five months' reporting period. (The programs completed under
the previous contract period are described in [1,2,3].)

During this reporting period the programming effort was
divided into two areas: grammar maintenance programs and systems
programs.

2.1 Maintenance Programs

2.1.1 Rule Modification

The capabilities of SUBGRM [1] were extended to permit the 
modification of individual parts of a rule. 

The rule modification portion of SUBGRM alters a specified 
rule according to the instructions of the command. 

A modify [M] command has the following format:

M RN(.DN), where

RN = rule number 

DN = duplication number. 

This command is followed by any number of insert [I] or 
delete [D] instructions. These instructions refer to the exact 
portion of the rule that is to be altered. 

The particular portions of a rule that can be modified are: 
the rule number, one or more complete terms, or any portion of a 
particular term — its subscripts (including operators, subscript 
name, and values), one or more of its operators (not to be con- 
fused with subscript operators), and any portion (line) of its 
choice sets. 

The modified rule is then written on a file to be later put 
on one of the grammar tapes. 
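
As an illustration of the command layout only, the header line M RN(.DN) could be read as in the following minimal sketch. The function name `parse_modify` and the returned tuple are assumptions made for this example; the report does not show how SUBGRM itself scans commands, and the insert [I] and delete [D] instructions are not modelled here.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern for the header "M RN(.DN)":
# a rule number, optionally followed by ".<duplication number>".
_MODIFY = re.compile(r"^M\s+(\d+)(?:\.(\d+))?\s*$")


def parse_modify(line: str) -> Tuple[int, Optional[int]]:
    """Return (rule_number, duplication_number or None) for a modify command."""
    match = _MODIFY.match(line.strip())
    if match is None:
        raise ValueError(f"not a modify command: {line!r}")
    rule_number, duplication = match.group(1), match.group(2)
    return int(rule_number), (int(duplication) if duplication is not None else None)
```

For instance, `parse_modify("M 4082.2")` would address duplication 2 of rule 4082, while `parse_modify("M 129")` leaves the duplication number unspecified.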






2.1.2 Conflation of Dictionary Rules

German and English verb entries are entered by the linguists
into the grammar as often as the entry has different meanings
and/or selection restrictions. This facilitates the actual en-
coding and updating of the verb dictionary. During grammar com-
pilation the various entries of one verb are conflated to a single
entry to reduce the size of the necessary storage and the number
of internal analyses. The conflation is performed by the dic-
tionary rule conflation program.



This program takes m dictionary rules, $R_1, R_2, \ldots, R_m$, each
with the same rule number, left-side category symbol, and right
side, where each $R_i$ has a set of left-side subscripts with $N_i$
columns, and constructs one rule having $T = \sum_{i=1}^{m} N_i$ columns.

A list of subscripts is input to the program and only the
subscripts appearing in this list are columnized (each column
being separated by an apostrophe). If a columnized subscript is
missing in one of the $R_i$ rules, a LA is inserted as the column
value. If a subscript is not to be columnized, it is conflated
to a single entry, and the values of the $R_i$ subscripts, if they
differ, are separated by commas.

Any trace information contained in individual rules is
carried over to the conflated rule.



For example:

C 4082      V V              BEWEG
  1         + CL(56)
            + PX(FORT)/
            + FS(N)/
            + TS(IN)/
            + FO(A)/
            + TO(R)
            T 1.2

C 4082      V V              BEWEG
  2         + CL(56)
            + PX(VORW)/
            + FS(N)/
            + TS(AL)/
            + FO(A)/
            + TO(R)/
            + OA(DIR)
            T 1.2
            T 1.6

The subscripts to be columnized are PX, FS, TS, FO, TO, and
OA. The resulting conflated rule is:
C 4082      V V              BEWEG
            + CL(56)
            + PX(FORT'VORW)/
            + FS(N'N)/
            + TS(IN'AL)/
            + FO(A'A)/
            + TO(R'R)/
            + OA(LA'DIR)
            T 1.2
            T 1.6
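
The conflation step for the two BEWEG entries can be sketched as follows. This is a simplified illustration, not the conflation program itself: it assumes that every input rule contributes exactly one column, represents each rule as a plain mapping from subscript name to value, and ignores trace lines. The names `conflate` and `MISSING` are invented for the example.

```python
from typing import Dict, List

MISSING = "LA"  # filler the report names for a columnized subscript absent from a rule


def conflate(rules: List[Dict[str, str]], columnized: List[str]) -> Dict[str, str]:
    """Merge the left-side subscripts of several entries of one dictionary rule.

    Each input rule maps subscript name -> value and contributes one column.
    """
    names: List[str] = []
    for rule in rules:                      # keep first-seen subscript order
        for name in rule:
            if name not in names:
                names.append(name)

    merged: Dict[str, str] = {}
    for name in names:
        if name in columnized:
            # one column per input rule, apostrophe-separated, LA where missing
            merged[name] = "'".join(rule.get(name, MISSING) for rule in rules)
        else:
            # single entry; differing values are listed once each, comma-separated
            distinct: List[str] = []
            for rule in rules:
                value = rule.get(name)
                if value is not None and value not in distinct:
                    distinct.append(value)
            merged[name] = ",".join(distinct)
    return merged


r1 = {"CL": "56", "PX": "FORT", "FS": "N", "TS": "IN", "FO": "A", "TO": "R"}
r2 = {"CL": "56", "PX": "VORW", "FS": "N", "TS": "AL", "FO": "A", "TO": "R", "OA": "DIR"}
result = conflate([r1, r2], ["PX", "FS", "TS", "FO", "TO", "OA"])
```

With these inputs, `result["OA"]` receives the LA filler for the first entry, and the non-columnized CL subscript collapses to a single value because both entries agree on it.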



2.1.3 CRLNIMG (Create Line Image)

CRLNIMG is the inverse of SUBGRM [1], which converts rules
from "print image" to "unpacked format".

CRLNIMG constructs the "print image of a rule" from its "un- 
packed format". The "print image of a rule" is the form in which 
the rule is originally coded by the linguists. The "unpacked 
format" is the internal representation of the rule. All programs 
which display rules use this routine. 

CRLNIMG is flexible in that it allows the calling program 
to specify the form of the print image in regard to spacing 
between terms and the width of each term in characters. 



2.1.4 GRMDIS2 (Select Symbols with Particular Features)
Purpose:

Read in one or more tapes containing distinct or mixed
grammars. If they are mixed, the program separates the rules so
that they may be used as distinct grammars. Distinct grammars
may also be displayed together.

A display is generated for each request. A request consists
of a grammar or grammars with or without specifications, and one
or more sorts of the grammar(s).

The grammars are dictionary, syntactic, word, normal form
(NF), NF zero and NF non-zero. Their specifications allow for
using only certain types of rules from the grammars. If no spec-
ifications occur, all rules in the grammar are used. Term, cat-
egory symbol, symbol name, and value may be specified. For
example:

TZ(1::CAT(N)),R

takes the NF zero grammar, chooses the rules whose left sides have
the category "Noun", and displays a recognition sort of those
rules.

The other sorts are form, production, and analysis. Form,
production, and recognition of the dictionary grammar may also be
reversed or smashed.

The requests are unpacked one character per word. Any rule
specifications are stored in unpacked internal representation in
array IRULE. The grammars and sorts are stored in a 10 x 4 array
[GRMSRT], where column 1 contains all combinations of form sorts;
column 2, all combinations of production sorts; column 3, all
recognition sorts; and column 4, all analysis sorts. Each entry
in the columns has the lower 6 bits set to 01B, 02B, or 03B, which
stand for standard, smashed, and reversed, respectively. The in-
formation for each grammar in the entry is 18 bits. Bits 1 to 6
are 03B = dictionary grammar, 04B = syntactic, 05B = word, 06B =
NF zero, or 07B = NF non-zero. Bits 7-12 are a pointer to the
block in IRULE which contains specifications for each grammar in
this particular sort. If the pointer is zero, there are no
specifications and all rules are to be sorted. Bits 13 to 18
point to a block in IRSP which contains the specifications in a
form used for printing sort titles.
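
The entry layout just described can be illustrated with a small packing and unpacking sketch. Several points are assumptions and not statements from the report: that the "B" suffix denotes octal, that bit 1 is the least significant bit of each 18-bit grammar field, and that the grammar fields are stacked directly above the lower 6 bits. The function names are hypothetical.

```python
from typing import List, Tuple

# Codes as listed in the text, read as octal (assumption about the "B" suffix).
VARIANT = {0o1: "standard", 0o2: "smashed", 0o3: "reversed"}
GRAMMAR = {0o3: "dictionary", 0o4: "syntactic", 0o5: "word",
           0o6: "NF zero", 0o7: "NF non-zero"}

Field = Tuple[int, int, int]  # (grammar code, IRULE pointer, IRSP pointer)


def pack_entry(variant: int, grammars: List[Field]) -> int:
    """Pack a sort variant and its grammar fields into one integer word."""
    word = variant & 0o77                       # lower 6 bits: sort variant
    for i, (code, irule, irsp) in enumerate(grammars):
        field = (code & 0o77) | ((irule & 0o77) << 6) | ((irsp & 0o77) << 12)
        word |= field << (6 + 18 * i)           # assumed stacking order
    return word


def unpack_entry(word: int) -> Tuple[int, List[Field]]:
    """Inverse of pack_entry; relies on grammar codes being non-zero."""
    variant = word & 0o77
    rest = word >> 6
    grammars: List[Field] = []
    while rest:
        field = rest & 0o777777                 # next 18-bit grammar field
        grammars.append((field & 0o77, (field >> 6) & 0o77, (field >> 12) & 0o77))
        rest >>= 18
    return variant, grammars


entry = pack_entry(0o2, [(0o3, 0, 5), (0o6, 2, 1)])
```

Under these assumptions one 6-bit variant plus three 18-bit fields would occupy 60 bits; an IRULE pointer of zero in a field corresponds to "no specifications" as stated above.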

The grammars are read, unpacked into internal representation,
and written on tape 3, 4, 5, 6, or 7, depending on whether the
rules are dictionary, syntactic, word, NF zero, or NF non-zero,
respectively.

When a rule is used in creating a display, a line or print
image is created for it by "Create Line Image" (cf. 2.1.3) and
written on tape 13, 14, 15, 16, or 17 for dictionary, syntactic,
word, NF zero, or NF non-zero, respectively. After the line images
for a grammar have been created, a flag indicating this is stored
in LIST so that the action will not be repeated if the grammar is
used more than once in a particular run.

Each rule in the grammar(s) being used is read and tested. If
it meets the specifications (when they exist), a sort key for the
particular sort is created. The data for the sort key is the line
image of the rule. These are written on tape 9, which is given
to SRTMRG, with common block LOSORT being 1-3=0, 4=5LTAPE9, 5=512,
6=6LTAPE10, 7-9=0. After the sort, the sort keys are thrown away.
Tape 10 then contains the rules ready for output.

Use:

Input:

Control parameters - each on a separate card.

NTAPES - I2 - number of input tapes.
REQ - I2 - number of requests to be read in.
IN - 80R1 - REQ cards, each containing one request.

A request is in the form A1+A2...An,B1,B2,...Bn, where
A = a grammar, with or without specification;
B = a sort.

A's are of the form G(T:CS:SN(V)), where
G = grammar, its values are:

D - dictionary
S - syntactic
W - word
T - NF zero and non-zero
TZ- NF zero
TN- NF non-zero.

T = term, its values are:

1 - left-side term
2→N - (N-1)th right-side term

CS = category symbol, may be any legal category symbol
SN = any legal symbol name
V = any legal symbol value.

B's may have the values:

F - Form
P - Production
R - Recognition
A - Analysis
FX- Form smashed
PX- Production smashed
RX- Recognition smashed
FR- Form Reversed
PR- Production Reversed
RR- Recognition Reversed

Smashed and reversed sorts may be used only with dictionary 
rules. Analysis with smashed and reversed options is equivalent 
to recognition smashed and reversed. 



G and T must be present with CS and/or SN. The V option may
be used only if SN is also used.

The grammar rules which are to be displayed are input
to SUBGRM. The output from SUBGRM is the input for GRMDIS2.
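
The request syntax lends itself to a short parsing sketch. The code below is an illustration under assumptions, not GRMDIS2's own scanner: it treats blanks as insignificant, keeps the specification T:CS:SN(V) as an uninterpreted string, and expands the grammar code T into NF non-zero followed by NF zero. The names `split_top` and `parse_request` are invented here.

```python
import re
from typing import List, Optional, Tuple

GRAMMARS = {"D": ["dictionary"], "S": ["syntactic"], "W": ["word"],
            "T": ["NF non-zero", "NF zero"], "TZ": ["NF zero"], "TN": ["NF non-zero"]}
SORTS = {"F", "P", "R", "A", "FX", "PX", "RX", "FR", "PR", "RR"}

_GRAMMAR = re.compile(r"^([A-Z]+)(?:\((.*)\))?$")


def split_top(text: str, sep: str) -> List[str]:
    """Split on sep, ignoring separators nested inside parentheses."""
    parts: List[str] = []
    depth, current = 0, ""
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)
    return parts


def parse_request(card: str) -> Tuple[List[Tuple[str, Optional[str]]], List[str]]:
    """Return ([(grammar name, specification or None), ...], [sort codes])."""
    fields = split_top(card.replace(" ", ""), ",")
    grammars: List[Tuple[str, Optional[str]]] = []
    for item in split_top(fields[0], "+"):
        match = _GRAMMAR.match(item)
        if match is None or match.group(1) not in GRAMMARS:
            raise ValueError(f"unknown grammar in request: {item!r}")
        for name in GRAMMARS[match.group(1)]:
            grammars.append((name, match.group(2)))
    sorts = fields[1:]
    unknown = [s for s in sorts if s not in SORTS]
    if not sorts or unknown:
        raise ValueError(f"missing or unknown sort codes: {unknown!r}")
    return grammars, sorts
```

A fuller version would also enforce the restrictions stated above, for example that smashed and reversed sorts apply only to dictionary rules and that V requires SN.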
Output:



The number of rules occurring for each grammar and an expan-
ded version of each request are printed first. The expanded
version of T(1:ABOVE),F,P is:

FORM NF NON ZERO (1:ABOVE) + NF ZERO (1:ABOVE)
PRODUCTION NF NON ZERO (1:ABOVE) + NF ZERO (1:ABOVE)

The remainder of the output consists of the displays in the 
order: form sorts, production sorts, recognition sorts, and analysis 
sorts. The first page of each sort has a heading, the date, and 
expanded version. The rest of the pages have expanded version 
and page number. 



READPF(U90,SUBGRM)
COPYBR(INPUT,R)
RFL,77700.
NOREDUCE.
SETCORE.
SUBGRM(,,R,A,B)
RETURN(SUBGRM,PSUBGRM,R)
REWIND(A)
RETURN(TAPE1)
RENAME(A,TAPE1)
READPF(0896,GRMDIS2)
RFL,77700.
NOREDUCE.

REQUEST(TAPE1)

SETCORE.
GRMDIS2.



*For data on cards which must be run through SUBGRM. 
**For data on tape which is the output from SUBGRM. 



2.2 Systems Programs

2.2.1 Word Choice 

Word Choice is performed from right to left, based on the 
files which contain the antecedent (left side) WORD. 

Word Choice begins by saving all FE's built upon by WORD
rules. It then deletes all FE's whose antecedent is either WORD,
LB, or RB. Then Word Choice proceeds to go through a "flagging-
unflagging" procedure wherein only the FE's which syntactic anal-
ysis will use are unflagged; all others are flagged. Word Choice
unflags all the FE's directly built upon by WORD rules.
The six instructions executed by Word Choice are as follows:

1. Rejection (R) - R T1,T2,...,TN.

This operator instructs Word Choice to delete the FE's whose
antecedents correspond to the right-side terms T1,T2,...,TN of
the rule whose antecedent contains the D operator. All FE's
building on T1,T2,...,TN must also be deleted.

2. Insert (I) - I nSN(X) or I n.m(X), where

n = term number
SN = subscript name
X = subscript value
n.m = locator.

This instructs Word Choice to insert the subscript SN with
the value X in the computed left side of the FE whose antecedent
corresponds to n. If the locator (n.m) format is used, the value
X is added to subscript m of term n.

3. Preference (P) indicates that a particular WORD reading
should have preference over all other WORD readings within the
span of this WORD rule. All readings within this span that are
not dominated by the P operator are deleted.

4. Superflagging (S) - S T1,T2,...,TN guarantees that a FE
stays flagged and cannot be unflagged by a WORD rule.

5. Deletion (D) - D T1,T2,...,TN. This performs the same
operation as the rejection operator. However, it also unflags
all FE's directly built upon by the deleted FE.

6. Setting of M conditions (M) - M T1,T2,...,TN. This in-
struction introduces an "M" operator into the condition columns
of each file preceding and following Ti (1 ≤ i ≤ N) (cf. p. ).



2.2.2 SYNT A (Syntactic Analysis) 

Syntactic Analysis is identical to Word Analysis [2] with 
the following exceptions: 

1. The workspace input by SYNT A is in WORD format and needs
only to be loaded. The workspace input by WORD A is in DICT format
and is first converted to WORD format.

2. While WORD A can build upon any FE, SYNT A can build only
upon unflagged FE's. The flagging and unflagging of FE's is done
by WORD C. The information as to whether a FE is flagged or not
is stored as part of the pointer of each FE.

3. While WORD A generates a list of all FE's where WORD
was constructed, SYNT A generates a list of all FE's where S was
constructed.



2.2.3 SVOL 

The SVOL package performs the four functions: 1) subscript
check, 2) value check, 3) operations, and 4) left-side construc-
tion. It is used by the majority of the analysis, synthesis, and
choice programs.



2.2.3.1 SC CHECK 

SC CHECK is called with three items: 1) a term number, 2) a
rule term, and 3) a workspace term. It checks to see that each
subscript present in the rule term has a corresponding subscript
in the workspace term. It can also check to see if either a
negative condition occurs (i.e., that a subscript must not be
present), or a don't-care condition (i.e., establish whether or
not the workspace subscript exists, but without any decision being
based upon the outcome). SC CHECK returns either true or false.
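
The three kinds of subscript condition can be sketched as below. This is a minimal model under stated assumptions: a rule term is represented as a mapping from subscript name to a condition, a workspace term as the set of subscript names it carries, and the term number argument is omitted. `Cond` and `sc_check` are names invented for the example.

```python
from enum import Enum
from typing import Dict, Set


class Cond(Enum):
    REQUIRED = "required"     # subscript must be present in the workspace term
    ABSENT = "absent"         # negative condition: subscript must not be present
    DONT_CARE = "dont_care"   # presence is noted but never decides the outcome


def sc_check(rule_term: Dict[str, Cond], workspace_term: Set[str]) -> bool:
    """Return True when every subscript condition of the rule term is met."""
    for name, cond in rule_term.items():
        present = name in workspace_term
        if cond is Cond.REQUIRED and not present:
            return False
        if cond is Cond.ABSENT and present:
            return False
        # Cond.DONT_CARE: existence is established, no decision is taken
    return True
```

A rule term requiring PX, forbidding OA, and not caring about TS would thus accept a workspace term carrying PX and FS.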



2.2.3.2 VA CHECK 

VA CHECK is called with one item, a term number. For each
subscript in the rule term a condition check, if necessary, is
performed to determine whether or not the corresponding workspace
subscript has a specified arrangement of values. This check is
performed for each column. Any column that fails the condition
check is eliminated from further processing. This elimination
occurs for every workspace subscript containing this column.
VA CHECK returns either true or false.
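
The column elimination can be pictured with the following sketch. It rests on assumptions the report does not spell out: the workspace term is modelled as a mapping from subscript name to a list of per-column values, each condition is a simple predicate on one value, and the check is taken to succeed when at least one column survives. The name `va_check` and the calling convention are hypothetical.

```python
from typing import Callable, Dict, List


def va_check(conditions: Dict[str, Callable[[str], bool]],
             workspace: Dict[str, List[str]]) -> bool:
    """Drop every column that fails any condition; mutate workspace in place.

    Every subscript named in conditions is assumed to exist in workspace
    (the subscript check is expected to have run beforehand).
    """
    width = len(next(iter(workspace.values()), []))
    alive = [True] * width
    for name, accepts in conditions.items():
        for column, value in enumerate(workspace[name]):
            if not accepts(value):
                alive[column] = False
    # the elimination applies to every subscript containing the column
    for name in workspace:
        workspace[name] = [v for v, keep in zip(workspace[name], alive) if keep]
    return any(alive)


ws = {"PX": ["FORT", "VORW"], "TS": ["IN", "AL"]}
kept = va_check({"TS": lambda value: value == "AL"}, ws)
```

Here the first column fails the TS condition, so it disappears from PX as well, leaving only the VORW reading.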



2.2.3.3 OPER 

OPER uses as its input all the terms with which SC CHECK has
been called. All the operations are first extracted from the rule
terms. They are then sorted into execution order, since one
operation can build upon the result of another operation. Third,
starting from the beginning of the list of operations, a determina-
tion is made as to how many consecutive operations must be per-
formed in parallel due to the fact that the workspace subscripts
they build upon are slashed together. This group of operations
is then re-sorted so that all difference operations can be done
first, intersections second, and summations last. This group of
operations is performed in parallel and then removed from the
list. Step three is repeated until either all the operations are
completed or at least one operation has returned false. OPER
returns either true or false.
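
The scheduling described for OPER can be sketched as follows. The encoding is an assumption made for illustration: each operation carries its position in execution order, a hypothetical group id shared by operations whose workspace subscripts are slashed together, its kind, and a callable that performs it. "In parallel" is modelled simply as running the whole batch before inspecting the results.

```python
from itertools import groupby
from typing import Callable, List, NamedTuple

# Order inside one parallel batch: differences, then intersections, then summations.
PRIORITY = {"difference": 0, "intersection": 1, "summation": 2}


class Op(NamedTuple):
    order: int                  # position in execution order (assumed precomputed)
    group: int                  # hypothetical id for slashed-together subscripts
    kind: str                   # one of the PRIORITY keys
    run: Callable[[], bool]     # performs the operation and reports success


def oper(ops: List[Op]) -> bool:
    pending = sorted(ops, key=lambda op: op.order)
    # groupby yields maximal runs of consecutive operations with the same group id
    for _, batch in groupby(pending, key=lambda op: op.group):
        ordered = sorted(batch, key=lambda op: PRIORITY[op.kind])
        results = [op.run() for op in ordered]
        if not all(results):
            return False
    return True


trace: List[str] = []


def step(label: str, ok: bool = True) -> Callable[[], bool]:
    """Build a dummy operation that records its label when executed."""
    def run() -> bool:
        trace.append(label)
        return ok
    return run


ops = [
    Op(2, 1, "summation", step("s")),
    Op(1, 1, "difference", step("d")),
    Op(3, 1, "intersection", step("i")),
    Op(4, 2, "difference", step("d2")),
]
outcome = oper(ops)
```

The dummy operations only record the order in which they run; in the real package each would act on workspace subscripts.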



2.2.3.4 LS CONST

LS CONST is called with one item, a rule term. The rule
term in this case is a set of instructions specifying how to build 
a new workspace term from all the preceding workspace terms and 
results of operations. LS CONST returns the new workspace term. 

2.2.4 STAN A (Standard Analysis)

Standard Analysis takes as input 1) the workspace output by
SYNT C, and 2) a standard grammar. It performs analysis over the 
sentences in the workspace using the standard grammar. The work- 
space is in PARALLEL format (see 2.2.5). 

The analysis procedure used is similar to that employed by 
WORD A and SYNT A, except that STAN A performs the analysis over 
all standard strings of each sentence "simultaneously". This is 
accomplished by performing first the analysis for standard string
number 1 from beginning to end in exactly the same fashion as
SYNT A. During this process, each time a new FE is built the FED
(file entry directory) constructed records all the standard strings
for which this FE applies. In addition, for each standard string
greater than 1 for which this FE applies, a flag is set to record
that this FE needs no further processing for the remaining standard
strings. Also, each time the analysis determines that a new file
needs to be processed but that the results of such processing can
only be relevant to standard strings greater than 1, all relevant
information is saved until that same FE is again processed for
a subsequent standard string.

After standard string 1 has been completely analyzed the 
process is repeated for standard string 2 from beginning to end. 
But this time an FE can only be in one of three states: 

1. If the FE did not exist in any previous standard string, 
it is analyzed normally. 

2. If the FE exists in a previous standard string, and if 
previous analysis of such a standard string has computed that no 
further rules can apply, this FE is not processed. 

3. If the FE exists in a previous standard string, and if
previous analysis of such a standard string has resulted in a 
portion of the analysis for this FE, the analysis resumes where 
it left off before. 

This process is repeated for each standard string of the
sentence.
2.2.5 OUT WS



OUT WS (Output Workspace) is a subroutine called by SYNT C
(Syntactic Choice [3]) to rebuild the output of SYNT C into a new
workspace.

The output of SYNT C is a tree which represents all of the
standard strings and their component parts for all "S" readings
covering the same span and for variables indicating whether or
not there are "S" readings. When there are no "S" readings,
a message is printed to that effect and the variables are printed.
Otherwise, the tree which represents n standard strings is re-
built into n trees, each representing one standard string.

For each set of standard strings a Sentence Directory is
built. The first word in a Sentence Directory is a pointer to
the first word in the next Sentence Directory, so that they are
connected by a one-way list. The next two words contain pointers
to the first File Entry Directory and to the last one.

File Entry Directories are also connected in a one-way list
by their first words. They contain a pointer to the first word
in the File Entry. The remainder of the File Entry Directory is
the File Entry which already exists in the original workspace;
it is merely transferred from one part of storage to another,
changing any pointers to make them relative to their new positions.

When all Sentence Directories have been created, the word in
the last Sentence Directory which points to the next Sentence
Directory is set to zero to show that there are no more Sentence
Directories. The Sentence Directories are then transferred to a
tape for use by other programs. The first two words on this tape
are "S" and "Next", where "Next" is a pointer to the first word
after the end of the workspace. These two words constitute the
first block. Then the Sentence Directories are written out in
blocks of 511 words. The last block consists of one word which
IS "^^END>' 
2.2.6 TRAN CH (Validity Check of Normal Form Rules)

This program determines whether the degree associated with 
the normal form expression (NF expression) is equal to the sum of 
the non-terminated non-terminal nodes of the standard subtree 
analyzed by the NF expression. 

TRAN CH uses as input all the NF rules and all the standard
grammars (Dictionary, Word, and Syntax). All the grammars have 
been preprocessed and a sorted file has been made which contains 
only the rule number, left-side (category symbol only), number of 
right-side terms, and the right-side terms themselves (category 
symbols only) for each rule. 
Given this stripped version of the grammars, each NF rule
is taken and the degree for that rule computed. This computed
degree must then agree with the degree coded for that rule. If
there is a discrepancy, the NF rule is in error.

In addition, the first occurrence of an NF name determines 
the degree for all subsequent rules having the same name. All 
rules having that NF name must have the same degree or they are 
in error. 
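
The two consistency conditions can be sketched as below. The sketch covers only the bookkeeping: how a degree is actually computed from the stripped grammars is not reproduced, so the computed degree is passed in as data. The tuple layout, the function name `check_degrees`, and the NF names used in the usage lines are invented for illustration.

```python
from typing import Dict, Iterable, List, Tuple

# (rule number, NF name, degree coded in the rule, degree computed from the grammars)
NFRule = Tuple[int, str, int, int]


def check_degrees(rules: Iterable[NFRule]) -> List[str]:
    """Return one message per violated condition, in input order."""
    first_seen: Dict[str, int] = {}
    errors: List[str] = []
    for number, name, coded, computed in rules:
        if coded != computed:
            errors.append(
                f"rule {number}: coded degree {coded}, computed degree {computed}")
        # the first occurrence of an NF name fixes the degree for that name
        expected = first_seen.setdefault(name, coded)
        if coded != expected:
            errors.append(
                f"rule {number}: degree {coded} differs from first {name} rule ({expected})")
    return errors


errors = check_degrees([
    (1, "NFA", 2, 2),   # consistent
    (2, "NFA", 3, 3),   # same NF name as rule 1 but a different degree
    (3, "NFB", 1, 2),   # coded and computed degrees disagree
])
```

Both conditions are reported independently, so a single rule may produce two messages.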

2.2.7 TRAN TC (Normal Form Tree Construction)

The sorted normal form grammar rules and pseudo rules (alpha
switch rules) [4] are used as input to TRAN TC. From this input
TRAN TC constructs trees representing the compiled NF rules.
(The process is very similar to Dictionary TC [1] except that the
right side consists of rule numbers instead of characters.)

TRAN TC reads in one entry at a time, comparing it term by
term with the previous entry. Whenever a term in the new rule 
differs from the previous rule, a down pointer is attached to the 
last matched term in the previous rule to indicate the place where 
the new rule continues. 

If the old rule is a subset of the new rule, a right pointer 
is attached to the old rule to indicate the continuation of the 
new rule. 

Each entry may contain attached information consisting of
1) subscript packages, 2) left-side package (rule termination),
or 3) connectors which indicate this rule involves a switch rule.

Each time a root term (rightmost term of a rule) is encoun-
tered, if it is the first time that term has been encountered
(duplication numbers do not count), a pointer to its position in
the tree is placed in an index table according to the root term's
numeric value (block and word number modulo 511). To avoid large
blocks containing no pointers, an index to the index table was
constructed.
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3 • 1 German Syntax 

3*1*1 Clause Description 

3*1*1*1 Strategy for Clause Description 

German clause level constituents can be permuted to a large 
extent. The number of clause patterns made possible by the re- 
arrangement of constituents is not affected by the subscript 
grammar. In order to reduce the number of clause patterns^ it 
was necessary to restrict the categories which were permitted to 
occur on clause level. Those which we utilized are: surface sub- 
ject» predicate^ surface object, and adverbial. 

3.1.1.1.1 Surface Subject

The surface subject carries the subscript CA(N,CL), which signifies
that the surface subject is either a noun phrase in the nominative
case, or a clause. The surface subject may dominate noun phrases
and subject clauses; among the latter are dass-clauses and verbal
clauses. The German word es is interpreted both as an adverbial
and as a noun phrase. In its anticipatory usage it is interpreted
as adverbial as in

Es befanden sich drei Leute im Zimmer.
Er hatte es aufgegeben, ihn davon abzubringen.



3.1.1.1.2 The Predicate 
The predicate is realized in three versions:

V PRED,

V PRED ... V PRFX

V MODAUX ... V VERBAL

PRED dominates the finite verb form of a regular verb or a
modal or auxiliary which is used as a full verb; it also dominates
the concatenations MODAUX VERBAL and VERBAL MODAUX.

PRFX dominates separable prefixes.

MODAUX dominates finite forms of modals and auxiliary verb
forms; the German verb lassen is classified as being a possible
modal, as in

Laß den Mann von dem Detektiv beobachten.

The German verbs bekommen and erhalten are classified as potential
auxiliaries which form the passive, as in

Er bekam (erhielt) ein Buch geschenkt.

VERBAL dominates non-finite forms of full verbs, zu-infinitives,
and concatenations of non-finite verb forms and non-finite modals
or auxiliaries.

3.1.1.1.3 Objects

Objects are realized as V NP. This symbol dominates noun
phrases and object clauses (cf. 3.1.1.1).

3.1.1.1.4 Adverbials

Adverbials appear as V ADV. They dominate one-word adverbs,
prepositional phrases, prepositional objects, subordinate clauses,
and noun phrases which function as adverbials of extension in
space and time, as vierzehn Jahre in

Es arbeitete vierzehn Jahre.



3.1.1.2 Clause Patterns

Our linguistic description of three chapters of a work on
aeronautics, Raketenantriebe: Ihre Entwicklung, Anwendung und
Zukunft [5], resulted in approximately 1100 different clause
patterns for the 150 pages of text. Rather than describe only
the occurring patterns, we decided to generate the complete set
of clause patterns systematically. This was done in order to
determine the extent to which coverage would be sufficient for the
analysis of these 150 pages as well as for the remainder of this
text or any other text.

The following basic patterns were assumed.

 1.   S-P
 2.   S-P-A
 3.   S-P-A-A
 4.   S-P-A-A-A
 5.   S-P-A-A-A-A
 6.   S-P-O
 7.   S-P-O-A
 8.   S-P-O-A-A
 9.   S-P-O-A-A-A
10.   S-P-O-A-A-A-A
11.   S-P-O-O
12.   S-P-O-O-A
13.   S-P-O-O-A-A
14.   S-P-O-O-A-A-A
15.   S-P-O-O-A-A-A-A
16a.  P
16b.  P-A
17.   P-A-A
18.   P-A-A-A
19.   P-A-A-A-A
20.   P-O-A
21.   P-O-A-A
22.   P-O-A-A-A
23.   P-O-A-A-A-A


where S = subject, P = predicate, A = adverbial, O = object. The
patterns 16a-23 can occur only in the passive voice or imperative
mood. Examples:

Wurde gearbeitet, wurde daran gearbeitet? (P-A)

The number of variants for the 24 basic patterns was almost
tripled by the fact that the predicate can occur in the three
different versions, as described above under 3.1.1.1.2. Moreover,
all terms except P were allowed to occur in any position. (The
restrictions on the position in which P may occur are given below.)
The decision to permit free permutation of all non-predicate clause
constituents may seem excessive. However, repeated inspection of
doubtful arrangements has shown that in all instances perfectly
well-formed German sentences could be so constructed.



3.1.1.3 Restriction of Predicate Position

The predicate may occur only in four positions: as the first,
second, penultimate, or ultimate term of a rule consequent.
Examples:

Bestand er darauf? Er bestand immer darauf. (Weil) er
immer darauf bestand, Recht zu haben.

If the predicate consists of two elements, the finite verb part
is restricted to positions one or two, the non-finite to the
penultimate or ultimate. Note that if PRED, VERBAL, or PRFX
occurs in penultimate position (where penultimate is not the
first or second position), the ultimate position must contain a
clause.
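
The combination of free permutation with the four permitted predicate positions can be sketched as a small enumeration. The Python fragment below is an illustration only: it treats the predicate as a single term P, ignores the three predicate versions and the clause requirement for the ultimate position, and therefore is not meant to reproduce the rule counts given in the chart of the next section. The function name `arrangements` is an assumption.

```python
from itertools import permutations


def arrangements(pattern):
    """Distinct orders of the terms in `pattern` (e.g. "SPOA") in which
    P is the first, second, penultimate, or ultimate term."""
    n = len(pattern)
    allowed = {0, 1, n - 2, n - 1}
    found = set()
    for order in permutations(pattern):
        if order.index("P") in allowed:
            found.add("".join(order))   # the set removes duplicate A/O orders
    return sorted(found)
```

For the pattern S-P-A-A-A, for instance, this yields 16 of the 20 distinct orders; the four excluded orders are those with P in the middle position.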



3.1.1.4 Coding of Clause Patterns

The projected task of clause description is represented in
the following chart.

Number of Possible German Syntactic Pattern Variants

Patterns with 2 right-hand terms:

   1.   CLS <- S P                                    5 rules
  16b.  CLS <- P A            (passive only)          3 rules
                                              Sum:    8

Patterns with 3 terms:

   2.   CLS <- S P A                                 17 rules
   6.   CLS <- S P O                                 17 rules
  17.   CLS <- P A A        }
  20.   CLS <- P O A        } passive only            6 rules
                                              Sum:   52

Patterns with 4 terms:

   3.   CLS <- S P A A                               31 rules
   7.   CLS <- S P O A                               62 rules
  11.   CLS <- S P O O                               31 rules
  18.   CLS <- P A A A      }
  21.   CLS <- P O A A      } passive only           24 rules
                                              Sum:  156

Patterns with 5 terms:

   4.   CLS <- S P A A A                             43 rules
   8.   CLS <- S P O A A                            129 rules
  12.   CLS <- S P O O A                            129 rules
  19.   CLS <- P A A A A    }
  22.   CLS <- P O A A A    } passive only
                                              Sum:  341

Patterns with 6 terms:

   5.   CLS <- S P A A A A                           56 rules
   9.   CLS <- S P O A A A                          220 rules
  13.   CLS <- S P O O A A                          276 rules
  23.   CLS <- P O A A A A    (passive only)         40 rules
                                              Sum:  592

Patterns with 7 terms:

  10.   CLS <- S P O A A A A                        335 rules
  14.   CLS <- S P O O A A A                        670 rules
                                              Sum: 1005

Patterns with 8 terms:

  15.   CLS <- S P O O A A A A                     1185 rules

Totals:

  Patterns with 1 term                                1 rule
  Patterns with 2 terms                               8 rules
  Patterns with 3 terms                              52 rules
  Patterns with 4 terms                             156 rules
  Patterns with 5 terms                             341 rules
  Patterns with 6 terms                             592 rules
  Patterns with 7 terms                            1005 rules
  Patterns with 8 terms                            1185 rules
  Total                                            3340 rules


Work on the generation of those variants has resulted in 315
rules coded so far.

In order to reduce the number of necessary clause rules to
more readily manageable proportions, a change of the analysis
algorithms to operate with set theoretical rules is envisioned.
The number of clause rules will be further reduced by the
introduction of optional terms.
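
As a quick check of the arithmetic, the per-length totals of the chart can be summed and compared with the 315 rules coded so far. The figures below are read from the totals chart; the variable names are only illustrative.

```python
# Number of projected rules by number of right-hand terms (from the chart).
rules_by_length = {1: 1, 2: 8, 3: 52, 4: 156, 5: 341, 6: 592, 7: 1005, 8: 1185}

total = sum(rules_by_length.values())   # projected number of clause rules
coded = 315                             # rules coded so far
share = coded / total

print(total, f"{share:.1%}")            # prints: 3340 9.4%
```

The sum agrees with the chart total of 3340, so the rules coded so far cover somewhat less than a tenth of the projected set.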



3.1.1.5 Clause Types

No decisions as to the type of clause (declarative, interrogative,
relative, etc.) are made in the actual analysis part of
a clause rule. Thus, the clause rule 186 (cf. below) analyzes
the underlined strings in the following examples:

... Mann, der ihn damals ...

Er sagte es, weil er ihn nicht leiden konnte.

The information necessary to determine the actual clause type is
contained in the antecedent of the clause rule. Information
pertaining to interrogative and relative clauses is contained under
the subscript FM. Subordinate clauses are recognized by their
predicate position (penultimate or ultimate) or by the subjunctive
mood of the verb. Rules 219 and 220 below thus analyze a clause
as a noun phrase; rule 222 determines the occurrence of a relative
clause; rules 218, 228, and 229 analyze a subordinate clause as
an adverb and determine the position in which it can occur in its
matrix clause.



C 218   V ADV            V CONJ       V CLS      V PNCT
        + POS(IN)        $ KT(S)      $ FL       $ TY(COM)
        + EX

C 219   V NP             V PNCT       V CONJ     V CLS
        + CA(CL,P+CL)    $ TY(COM)    $ WD(TH)   $ FL
        + TY(TH)
        + POS(FL)
        + EX

C 220   V NP             V PNCT       V CONJ     V CLS      V PNCT
        + CA(CL,P+CL)    $ TY(COM)    $ WD(TH)   $ FL       $ TY(COM)
        + TY(TH)
        + POS(MED)
        + EX

C 222   V CLSREL         V CLS
        - 2.3            $ G
        = 2              $ N
                         $ FL
                         > WD

C 228   V ADV            V PNCT       V CONJ     V CLS      V PNCT
        + EX             $ TY(COM)    $ CJ(S)    $ FL       $ TY(COM)
        + POS(MED)                    $ KT(S)

        1  $ 3-4

C 229   V ADV            V PNCT       V CONJ     V CLS
        + EX             $ TY(COM)    $ CJ(S)    $ FL
        + POS(FL)                     $ KT(S)

        1  $ 3-4
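
To make the structure of such rules concrete, rule C 229 can be written down as data and tested against a sequence of constituents. This is a hedged sketch, not the analysis algorithm of the system: the reading of a bare subscript such as $ FL as "the subscript must be present" and of $ TY(COM) as "the subscript must contain the value COM" is an assumption, as are the names `RULE_C229`, `term_matches`, and `rule_applies`.

```python
# Rule C 229 as data: the left side with the features it assigns (+),
# and the right-hand terms with the conditions they impose ($).
# None means "subscript must be present"; a set lists required values.
RULE_C229 = {
    "left": ("ADV", {"EX": None, "POS": {"FL"}}),
    "right": [
        ("PNCT", {"TY": {"COM"}}),
        ("CONJ", {"CJ": {"S"}, "KT": {"S"}}),
        ("CLS", {"FL": None}),
    ],
}


def term_matches(term, constituent):
    category, conditions = term
    found_category, features = constituent
    if category != found_category:
        return False
    for subscript, required in conditions.items():
        if subscript not in features:
            return False
        if required is not None and not (required & features[subscript]):
            return False
    return True


def rule_applies(rule, constituents):
    right = rule["right"]
    return len(right) == len(constituents) and all(
        term_matches(term, constituent)
        for term, constituent in zip(right, constituents)
    )
```

Under these assumptions, a comma followed by a subordinating conjunction and a clause with final predicate position satisfies the right-hand side, and the rule would then label the whole string as an adverb in final position.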
3.1.1.6 Information in Clause Rules

The decisions that need to be made in the syntactic part of
a clause rule may be described by discussing the following rule.

The consequent part of rule C 186 (terms 2-5) analyzes any
string consisting of a surface subject, followed by a surface
object, followed by an adverb, followed by a predicate. The
subject must either be in the nominative case or it must be a clause.
The subject must also agree in person and number with the
predicate.

The predicate may be composed of an auxiliary and a
non-finite verb form or it may be a finite verb. It must not occur
with a prefix [PX(LA)], and it must govern the case of the surface
object and may govern the case of the surface "adverb" if it is a
prepositional object.

The antecedent of rule C 186 stores the tense and mood
information of the predicate as well as information about the type of
auxiliary which occurs if the predicate consists of a finite verb
and a non-finite verb. The antecedent also stores the information
that the predicate is in final position [FL]. It stores the case
of the governed object [FO1] and the case information, if any, of
other objects which the verb governs obligatorily [FO]. It stores
the preposition of the "adverb" if it is a prepositional object
[A1].

If the subject is a pronoun, the antecedent stores information
as to whether it is an interrogative pronoun [FM], or a relative
pronoun [WD,G,N].



3.1.2 Noun Phrases

Verbal participles can function as adjectives. Thus the
variety of clause patterns would theoretically be repeated for
noun phrases if strings like die von uns vorgeschlagene Lösung
were interpreted as in the following diagram:

DIE VON UNS VORGESCHLAGENE LOSUNG

Note that von uns is the deep subject of the verb vorschlagen,
and Lösung its object.

Noun phrases are described as consisting of up to three terms:
determiner, adjective phrase, and noun. The rule system on p. 1.2
was designed to guarantee agreement between adjectivally used
participles and their modificands. In the case of present
participles, this agreement corresponds to that between the underlying
verb and its subject. In the case of the past participle, it is
between the underlying verb and its object. The solution was
facilitated by the fact that such adjectival verb forms within a
noun phrase do not occur with sentence complements.



3.1.3 German Determiners and Pronouns

German determiners are marked for gender, number, case, and
inflection (strong or weak), as described earlier in [3].

In addition to gender, number, and case markers, pronouns have
a subscript FM (form), whose values identify them as P+DEM
(demonstrative pronoun), P+PERS (personal pronoun), P+REL (relative
pronoun), P+REF (reflexive pronoun), P+INT (interrogative pronoun),
P+IND (indefinite pronoun), P+POSS (possessive pronoun), or P+REC
(reciprocal pronoun). All those German dictionary items which may
function as either determiner or pronoun, depending on their
environment, were coded only once, with a complex label which
contains their features as determiners and their features as pronouns.
(They are identified as possible determiners by the value DET
under the subscript FM.) This prevents multiple analyses of such
items regardless of their environment, a considerable saving
because of their frequency of occurrence in actual texts. Examples
are shown on the following page taken from the German LRC
dictionary.

Rule p J classifies the item der as determiner, relative pronoun
and demonstrative pronoun.

Relative pronouns were given the subscripts G, C, and N in
addition to GD, CA, and NU. The latter set of features is for
agreement with the following nominal in relative noun phrases:

Es geschah in 1963, zu welcher Zeit derartiges noch ...

The features G and N must agree with the gender and number of the
preceding nominal which is being modified by the relative clause.
C (case) must contain the case governed by the verb:

die Explosion, deren man noch heute erinnert.

The subscript TY (semantic type) is added to all pronoun
entries. It reflects the semantic type of noun each pronoun may
represent. Thus, the German relative pronoun was is assigned the
feature TY(AB) for "abstract"; the indefinite pronoun jemand, the
feature TY(HU) for "human"; the pronoun der, all possible main
semantic classes: TY(HU,AL,PL,IN,NT,AB).

Personal pronouns are also given the subscript PS (person) 
with the possible values 1, 2, and 3 for 1st, 2nd and 3rd person. 

Some pronouns or determiners which must be recognizable as
a specific lexical item are marked by the subscript WD (word)
with an identifying abbreviation as value. For example, only the
item was may be used as a relative pronoun modifying a clause
rather than a noun or noun phrase. For this reason, it contains
the feature WD(W).

The rules which analyze pronouns as NPs assign the gender,
number, case, and semantic type features of the pronoun in the
workspace to the NP. For personal pronouns, the PS feature is
also assigned to the NP; for all other pronouns, the new feature
PS(3) marks the NP as 3rd person.
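
The feature assignment just described can be sketched in a few lines. The representation of a dictionary reading as a Python mapping from subscript names to sets of values is an assumption made for the illustration, as are the function name and the sample values "M", "SG", and "N"; only the subscript names GD, NU, CA, TY, PS, and FM and the values P+PERS, P+IND, and HU are taken from the text.

```python
def np_features_from_pronoun(pronoun):
    """Copy gender, number, case, and semantic type to the NP and set PS."""
    np = {name: set(pronoun[name])
          for name in ("GD", "NU", "CA", "TY") if name in pronoun}
    if "P+PERS" in pronoun.get("FM", set()) and "PS" in pronoun:
        np["PS"] = set(pronoun["PS"])   # personal pronoun: keep its person
    else:
        np["PS"] = {3}                  # all other pronouns: 3rd person
    return np
```

An indefinite pronoun such as jemand, coded with FM(P+IND) and TY(HU), would thus yield an NP carrying TY(HU) and the new feature PS(3).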



3.2 Choice Rules 

3.2.1 Function 

It is the function of the choice rules associated with a
particular syntactic rule:

to determine the deep structure of the string interpreted
by the rule, based on the semo-syntactic features associated
with the clause constituents, and

to generate that deep structure, called standard string.

This is performed by permuting the clause level constituents, by
adding new terminal symbols (dummy terms or standard terminals),
and by deleting certain surface terminals, such as prefixes or
the reflexive pronoun in cases of actual reflexive verbs. (Sich
beeilen vs. sich waschen; *Ich beeile ihn, but Ich wasche ihn.)

An additional function of the choice rules is the elimination
of forced ambiguous readings.



3.2.2 Determination of the Deep Structure 

The deep structure of a given sentence is determined in 
several phases. 



3.2.2.1 Determination of Clause Type

We distinguish between active clauses, passive clauses, copula
clauses, and lassen clauses. Lassen clauses are further divided
into lassen clauses with an embedded active clause, and such with
an embedded passive clause. Example:

(Active) ließ den Jungen den Hund schlagen,

(Passive) ließ den Hund von dem Jungen schlagen.

The clause type is determined by evaluating the result of the
intersection between the values of the subscript TY of the
constituent MODAUX and the values of the subscript AX of the
non-finite verb part. The actual decisions are made by choice rule
V CT which is called with the information contained in the rule
antecedent and the full verb (PRED or VERBAL).

The decisions made in this rule are represented by the following
graph:

The numbers in the graph correspond to the numbers in the choice
rule.

The decisions made by this choice rule are not final; the
actual determination of the clause type is dependent on the
relations between the predicate and its complements. Thus, addieren
(to add) will first be interpreted as active voice in a sentence
such as die Zahlen addieren sich zu hundert. The verb complement
choice rule V ACrVSOA (for active complete: verb, subject, object,
adverb) will, however, assign to this clause the interpretation
"passive voice". This permits both the translation the numbers
are added up to a hundred and the numbers add up to a hundred.
Similarly, wird getanzt, which superficially looks like a
passive sentence, will be interpreted as an active sentence with
a deleted agent, which permits the translations they danced or
people danced.



3.2.2.2 Determination of Adverbials 

The choice rule called V SPECAV (special adverb) determines
whether an adverb is the negation nicht (not), or an adverb of
the type gern, lieber, weiter which function as deep predicates.
The negation is moved directly behind the subject to facilitate
the generation of the English output; it could, however, be put
in front or behind the actual clause to indicate its operator
status. This difference in treatment would not have any effect
on the translation. The special adverbs of the type gern, lieber
are moved into the predicate position. The surface predicate and
its object complements are treated as the clause complement of
the predicate represented by the surface adverb.

Adverbials which dominate prepositional phrases undergo
further checking in the verb complement choice rules. There, we
determine whether or not the adverb is a prepositional object
of the predicate. In the final assignment statement, each adverb
is assigned a numerical value (cf. choice set number 12 in rule
C 186 above). These numerical values are deleted if they compete
with an alphameric name such as N, SP, or O, O2, O3.



3.2.2.3 Verb Complement Rules

We distinguish four types of verb complement rules: those
which contain a passive predicate, an active predicate, a copula,
or a form of lassen. The corresponding choice rules begin with
the letters PC, AC, CC, and LC. For each set of these choice
rules there exist as many alternants as there are basic sentence
patterns (cf. the patterns in 3.1.1.2 above). The functions
of verb complement rules are basically four-fold:

a. they determine whether the subject and object of the
verbs agree with the verb in syntactic surface appearance and

b. if these tests fail, we test for the occurrence of a
different clause type (remember addieren sich and wird getanzt
above);

c. after that we test for the occurrence of a lexical
collocation. This test is executed superficially only, by checking
whether the verb and the constituent in question agree in their
values of LC (for more details, cf. 3.4.2 below);

d. if all of these tests fail, the main rule which called
the choice rule is rejected.

3.2.2.4 Superscript Assignment

The clause constituents are finally connected in the sequence
represented by the final superscript assignment (statement number
13 in rule C 186, p. III.22). The sequence
L-S-N-SP-VC-F-M-P-LC-O-O2-O3-1-2-3-4-R stands for "left boundary,
deep subject, negation, special adverb, voice information, tense
information, auxiliary or modal, predicate, lexical collocation,
first deep object, second deep object, third deep object, first
adverb, second adverb, third adverb, fourth adverb, right
boundary". Of these, only left boundary, subject, voice, tense,
predicate, and right boundary are obligatory. If a constituent is
assigned more than one alphabetic superscript name, the Cartesian
product of permissible standard strings will be generated, with
the provision

a. that no two identical names may occur in the same
standard string, and

b. that identical standard sequences which were derived
by means of assigning different names to the same constituent
are conflated to one standard string.
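
The generation of standard strings under provisions a. and b. can be sketched with a Cartesian product over the permissible names of each constituent. The fragment is an illustration under assumptions, not the system's algorithm: constituents are given as (label, permissible names) pairs, the labels such as NP1 or ADV are hypothetical, and a standard string is represented simply as the sequence of labels in superscript order.

```python
from itertools import product

ORDER = ["L", "S", "N", "SP", "VC", "F", "M", "P", "LC",
         "O", "O2", "O3", "1", "2", "3", "4", "R"]
RANK = {name: position for position, name in enumerate(ORDER)}


def standard_strings(constituents):
    """constituents: list of (label, [permissible superscript names])."""
    labels = [label for label, _ in constituents]
    results = {}                      # dict keeps first-seen order
    for names in product(*(options for _, options in constituents)):
        if len(set(names)) != len(names):
            continue                  # provision a: no name used twice
        pairs = sorted(zip(names, labels), key=lambda pair: RANK[pair[0]])
        sequence = tuple(label for _, label in pairs)
        results.setdefault(sequence, None)   # provision b: conflate duplicates
    return list(results)
```

With two noun phrases that may each be named S or O, and one adverb that may be named 1 or 2, eight name assignments are formed, four violate provision a., and the remaining four collapse into two distinct standard strings.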

Currently, choice rules can only be called from a main rule,
but we plan to extend the algorithms' capability to that of calling
a choice rule from a choice rule. However, a restriction will be
imposed, in that a choice rule which had been called by another
choice rule may not call a third choice rule.

3.3 Standard Grammar

3.3.1 Economy of Standard Descriptor

The number of standard grammar rules is fairly small. There
are three reasons for this:

a. the constituents occur in their deep structure order,

b. relevant boundary information is retained by means of
dummy terminals, and

c. surface structure, which is identical to deep structure,
is retained.

Surface structure is destroyed in those cases where each node
labeled clause, or dominating a clause, or dominating a standard
expanded adjective, is destroyed. Adjectives are expanded if they
concatenate with an object complement or a sentential adverbial.

Clause rules are destroyed because each clause rule introduces
at least three dummy terms and these dummy terms must be
incorporated into the standard structural description.

Strings containing an expanded adjective phrase are
rearranged to represent standard order.



3.3.2 Standard Clause Patterns

Concatenations of the predicate and its complements are
interpreted by the symbol KERN. There are exactly four types of
structures dominated by KERN.

1. Structures consisting of subject and predicate and no
object.

2. Structures consisting of subject and predicate and
exactly one object.

3. Structures with subject, predicate, and two objects.

4. Structures with subject, predicate, and three objects.

These four structures are represented by the following standard 
rules. 
The symbol VB stands for verb phrase boundary. It is introduced
in the verb complementation choice rules. 



Sentence adverbials concatenate in binary rules with the 
symbol KERN recursively. 

As can be seen from rules C ^10133 and C ^0139 above, standard 
analysis is followed by standard choice. There are only two types 
of instructions executed in standard choice: assignment statements, 
which help to select the proper translation equivalents, and
superscript assignment statements, which change the order of the
standard terms to the universal order if the two should be
different.



3.4 Lexicography

During this reporting period, the lexical data base necessary
for the analysis and translation of the text Raketenantriebe:
Ihre Entwicklung, Anwendung und Zukunft was completed and revised.
The lexical data base consists of four dictionaries: a German
monolingual dictionary, which lists the syntactic and
semo-syntactic features of the German word stems in the form of
subscripts and values; a German normal form dictionary, which
establishes meaning (or translation equivalence) classes for the
lexical entries in the German monolingual dictionary; an English
monolingual dictionary, which assigns features to English word
stems; and an English normal form dictionary, which establishes
translation equivalence classes for the English dictionary items.

3.4.1 Verb Entries

The general feature system for verbs, nouns, and adjectives
is described in [1]. This system was used in the coding of
monolingual dictionary rules for German and English verbs, with the
following minor modifications:

a. FS (syntactic form of subject required by the verb) is
always shown as either N (nominative NP) or CL (clause);

b. the subscript TS (semantic type of subject) lists, for
those verbs which allow such subjects, the values TH (that-clause),
MI (marked infinitive), ICL (interrogative clause), FT (for-to
complement, English only), or IT (impersonal subject es or it),
in addition to semantic noun classes;

c. the earlier subscript OS (deep subject) was replaced in
the dictionary rules by IS (interpretation of subject). Its
possible values are S (subject, where surface and deep subject are
identical) and O (direct object, where the surface subject
represents the deep object);

d. analogous to the subscript IS, the new subscript IO
(interpretation of object) was introduced, with the possible
values O (direct object), O2 (indirect or "second" object), O3
(third object), and S (subject; for those verbs whose surface
object represents the deep subject as in dieser Versuch gelang ihm
= he succeeded in this attempt);

e. the subscript OB was changed to FO (form of object). Its
possible values remain N, G, D, A (for the cases in German), O
(for NP objects in English), and all prepositions which can be
used in prepositional objects of verbs. An additional value is
CL (clause). The values TH (that-clause), MI (marked infinitive),
etc. (cf. b. above) are also listed under TO (semantic type of
object), in addition to the semantic noun classes. Double objects
are indicated by under FO, TO and IO. A + is used to combine
two values into one, as in AUF1+CL, which stands for "the object
consists of the preposition auf followed by a clause", as in
darauf achten, daß...

In addition to these semo-syntactic and syntactic features,
each verb stem was assigned the subscript CL (paradigmatic class)
with the number of the specific inflectional class it belongs to,
and the subscript PX (separable prefix) with the values LA
(lambda = no prefix) or AB, AN, AUF, etc.

For examples of German and English monolingual dictionary
rules for verbs, cf. [3], pp. II-29 and II-30. The "^" sign in
the last two rules on both pages separates "columns" of features.
The value shown preceding the ^ in one subscript forms a feature
packet with all other values preceding the ^ in other subscripts.
Another feature packet is formed by the values following the ^.
Examples of the normal form rules for the German verb ändern
(with the prefix ab and without a separable prefix) and its
English translation equivalents change and modify are marked on
the three following pages taken from the German and English normal
form dictionaries (pp. III-39 and III-40, respectively).
In these rules, the first column is a unique number identifying
each rule. The second column contains the class name and certain
information to facilitate rule inspection by the human users.
For their convenience, the translation equivalence class names
used are the English canonical forms, together with subscripts
introduced by on the left side, where necessary. For
example, the German verb ändern with the prefix ab and the English
verb modify are both members of the class MODIFY. No distinguishing
subscripts are necessary. The German verb sich ändern is in the
class CHANGE FO(LA) to insure its translation into the intransitive
form of the English verb change. The German verb etwas oder
jemanden ändern is in the class CHANGE FO(O), the transitive form
of the English verb change.

The additional subscripts are only for convenience and do
not distinguish class names: CAT(V) stands for "category symbol
'verb'" to indicate that the classified dictionary item is, e.g.,
the verb and not the noun change. In the German normal form
rules, an additional subscript, e.g., TM(AENDERN), indicates the
particular German verb covered by this rule.

The third column contains the right-hand sides of these
normal form rules. Here, the first line contains the number
uniquely identifying a dictionary rule in the German or English
monolingual dictionary, respectively. Where necessary, this
number is followed in the second and subsequent lines by subscript
conditions which list the restrictions on the application of
the normal form rule. For example, C 4002 identifies the German
dictionary entry for ändern. The normal form rule C 2 (page
III-39) applies to this verb only if it is used with the prefix
ab [this condition is expressed by $ PX(AB)] and guarantees
translation into modify. Rule C 25 applies if the verb ändern is
used reflexively [$ TO(R)] and without prefix [$ PX(LA)]. Rule
C 392 applies if the same verb is used without prefix and not
reflexively [$ PX(LA) $ TO(^*^R)].
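
The way subscript conditions select among the three normal form rules for ändern can be sketched as follows. This is a simplified illustration and not the dictionary format itself: the conditions are written as Python predicates over a feature mapping, the labels "CHANGE (intransitive)" and "CHANGE (transitive)" are descriptive stand-ins for the class names, and the pairing of rules C 25 and C 392 with those two classes is inferred from the description of sich ändern above rather than stated outright.

```python
# (rule number, translation equivalence class, condition on the verb's features)
NF_RULES = [
    ("C 2", "MODIFY",
     lambda f: f["PX"] == "AB"),
    ("C 25", "CHANGE (intransitive)",
     lambda f: f["PX"] == "LA" and "R" in f["TO"]),
    ("C 392", "CHANGE (transitive)",
     lambda f: f["PX"] == "LA" and "R" not in f["TO"]),
]


def applicable_rules(features):
    """features: {"PX": prefix value, "TO": set of object-type values}."""
    return [(number, name) for number, name, condition in NF_RULES
            if condition(features)]
```

A reading of ändern with PX(AB) thus selects only rule C 2, while a prefixless reflexive reading selects only rule C 25.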



3.4.2 Dictionary Entries for Verbal Lexical Collocations

3.4.2.1 Entries Without Internal Variables

We refer to those verbs as "verbal lexical collocations"
which consist of a verb and a noun phrase, a prepositional phrase,
a non-finite verb form, an adjective, or an adverb, e.g., erfolgen
= take place; ins Gewicht fallen = be important. Since such
lexical collocations have features and meanings which may not be
derived from their individual components, they must be treated as
lexical phrases (cf. [2], pp. 13 ff.). However, since their
components frequently occur discontinuously and in various sequences,
they must be handled differently from normal verbs. For this
purpose, the subscript LC (lexical collocation) was added to the
dictionary entries of those nouns and verbs which may occur as
components of lexical collocations. Surface analysis refers to these
subscripts and guarantees that the items occur contiguously and
in a predefined order in the so-called "standard string" (cf.
III.3.2.2.4), which is generated after surface analysis.

For the actual analysis and translation of lexical collocations,
standard dictionary rules were coded which are applied to
the standard strings.

The constituents of standard strings are the dictionary
readings of the underlying lexical elements. Standard dictionary
rules concatenate these readings in multi-branch rules and assign
to the whole structure the syntactic and semo-syntactic features
described for the general verb system in [1] and under 1. above.
Thus, the German standard dictionary rule C 5011 (marked on the
following page of computer print-out) analyzes the lexical
collocation ins Gewicht fallen (be important):



C 5011



Standard String: 



C 4201 



C 9028 
TS(E)

IS(S) 

FO(LA) 
To this standard rule (C 5011) the following German normal form
rule applies:

V BE+IMPORTANT              C 5011  C 20868  C \kS  C 9028  C 4201
                            A 1  A 1  A 1
A CAT(V+P)                  B 2  B 3  B 4
H TM(INS+GEWICHT+FALLEN)



[Computer print-out (update program listing): page of the German
standard dictionary containing the marked rule C 5011.]
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The corresponding English normal form rule 
lation equivalence class, BE+ I MPORTANT : 



is in the same trans- 
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V BE+IMPOR C 5123 C 20 C 10377 

TANT A 1 

A CAT(V+P) B 3 

The right-hand side of this rule refers to the English standard
dictionary rule C 5123, which, in turn, generates the English
standard string:

C 5123 
V 

FS(N) 
TS(E) 
IS(S) 
FO(LA)
TO(LA)
IO(LA)

C 20 C 10377 

BE IMPORTANT 

The correct endings and morphological variants (in this case am,
are, is, etc.) are generated by the English rearrangement grammar.
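
The chain from the German collocation to its English equivalent can be
pictured as a series of table lookups. The following Python sketch is
an illustration only: the table and function names are hypothetical,
the tables hold just the one example discussed above, and the report
does not describe the LRS rule interpreter at this level of detail.

```python
# Hypothetical lookup tables holding only the example from the text.

# German normal form rule: standard rule number -> translation equivalence class
GERMAN_NORMAL_FORM = {"C 5011": "BE+IMPORTANT"}

# English normal form rule: equivalence class -> English standard rule number
ENGLISH_NORMAL_FORM = {"BE+IMPORTANT": "C 5123"}

# English standard dictionary rule: rule number -> standard string
# (a sequence of English dictionary rule numbers)
ENGLISH_STANDARD = {"C 5123": ("C 20", "C 10377")}

# English dictionary rules: rule number -> stem
ENGLISH_DICTIONARY = {"C 20": "BE", "C 10377": "IMPORTANT"}


def transfer(german_standard_rule):
    """Map a German standard rule to the English stems it corresponds to."""
    equivalence_class = GERMAN_NORMAL_FORM[german_standard_rule]
    english_standard_rule = ENGLISH_NORMAL_FORM[equivalence_class]
    standard_string = ENGLISH_STANDARD[english_standard_rule]
    return [ENGLISH_DICTIONARY[number] for number in standard_string]


print(transfer("C 5011"))
```

The result is a list of uninflected stems; producing am, are, is, etc.
is left to a later step, as in the rearrangement grammar mentioned
above.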

German verb phrases which we call "hidden passive phrases"
(i.e., those which contain empty function verbs such as gelangen zu
and kommen zu, followed by nominalized verbs) were also treated
as lexical collocations. Examples are:

     zur Ausstossung gelangen = be ejected
     zum Einsatz kommen = be employed

An additional subscript P identifies these German phrases as 
passive in meaning to guarantee their correct translation. The 
English translation equivalents were not coded as phrases. 

3.4.2.2 Lexical Collocations with Internal Variables

Some lexical collocations contain variable internal slots,
as for example the noun modifier slot in to take V care that...,
where V stands for "variable": he took care that..., he took
great care that..., he took the greatest possible care that...,
etc. For such phrases, standard rules were written which provide
for variables in their right-hand sides (cf. rule C 5120 below).
Since the present rule format does not allow optional rule
constituents, several rules were coded for such phrases, one for
each possible string. As an example, the English standard rules
for take care that are shown here:



C 5119   V V             C 4726       C 23195
         + FS(N)         $ PX(LA)
         + TS(HU)
         + IS(S)
         + FO(CL)
         + TO(TH)
         + IO(O)

C 5120   V V             C 4726       V ADJ       C 23195
         + FS(N)         $ PX(LA)
         + TS(HU)
         + IS(S)
         + FO(CL)
         + TO(TH)
         + IO(O)

where C 4726 is the rule number for take and its allomorphs, and
C 23195 the number for care.

For some phrases, up to six rules were necessary to allow
for optional determiners, noun modifiers, and plural noun endings,
e.g., to pose (DET) (ADJ) problem(s). These lexical collocations
do not constitute set phrases, but rather, instances in which a
verb has a specific and unusual meaning (and translation) in the
environment of a noun phrase whose head noun is a particular
lexical item. Beyond this, all normal rules of NP analysis and
generation apply. The development of a new algorithm (lexical
collocation phase) is planned to permit the expression of these
relations in a more economical manner than the one described
above.
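
The cost of a rule format without optional constituents can be made
concrete with a small sketch. The Python below is an illustration, not
the LRS implementation: the function name, the pattern notation, and
the PLURAL marker are hypothetical. It enumerates every explicit string
that a pattern with optional slots stands for, which is what coding
"one rule for each possible string" amounts to.

```python
from itertools import product


def expand_optional(pattern):
    """Expand a pattern with optional constituents into explicit strings.

    `pattern` is a list of (constituent, is_optional) pairs. Each optional
    constituent is either present or absent in a given string.
    """
    alternatives = [
        ((item,), ()) if is_optional else ((item,),)
        for item, is_optional in pattern
    ]
    strings = []
    for combination in product(*alternatives):
        strings.append([item for part in combination for item in part])
    return sorted(strings, key=len)


# take (ADJ) care: rule numbers as given in the text
TAKE_CARE = [("C 4726", False), ("V ADJ", True), ("C 23195", False)]

# pose (DET) (ADJ) problem(s): PLURAL is a hypothetical marker for the ending
POSE_PROBLEM = [
    ("pose", False), ("DET", True), ("ADJ", True),
    ("problem", False), ("PLURAL", True),
]

for string in expand_optional(TAKE_CARE):
    print(" ".join(string))
print(len(expand_optional(POSE_PROBLEM)))
```

For take care the two strings correspond to rules C 5119 and C 5120.
For three independent optional elements the full expansion yields eight
strings; since the report mentions up to six rules, presumably not
every combination needed to be coded.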

For the text Raketenantriebe: Ihre Entwicklung, Anwendung
und Zukunft, the number of lexical collocation rules coded is
approximately as follows:

     German standard dictionary rules:    165

     English standard dictionary rules:   160

     English normal form rules:           140

     German normal form rules:            200



3.4.3 Noun Entries

In order to classify German nouns and their translation
equivalents adequately, several modifications of the subscripts
and their values as described in an earlier report became necessary
(cf. [1], pp. II-19 through II-22).

GD = gender (for German only), with the values M (masculine), 
F (feminine), N (neuter). The subscript SX (sex) is 
used for English nouns. 

FC = form of complement (syntactic), and

TC = type of complement (semantic), replace the earlier OB 
and TO, respectively. 

In addition, the following features were coded:

CL = "paradigmatic class" (1-64 in German, A in English),

ON = "onset" (English only), with the values C (consonant),
     V (vowel),

IO = "interpretation of object", with the possible values O,
     O2, and LA (cf. p. III-38, d),

TT = "tantum noun", with the values S for singular, P for
     plural (e.g., [illegible], Kosten),

CP = "capitalization", with the value N (none), to mark such
     non-capitalized nouns in German as sek., the abbreviation
     for Sekunde,

LC = "lexical collocation", with the values:

     N     noun only
     NP    DET and/or ADJ + NO
     PN    PREP + NO
     PP    PREP + DET (+ADJ) + NO

The following rules are examples of the dictionary rules for
one German noun and its two translation equivalents, and the
necessary normal form rules for their dictionary items.

German dictionary rule

C 20078   V N                    ANREICHERUNG
          + CL(20)
          + GD(F)
          + TY(AB)
          + FC(LA,G,VON.MIT)
          + TC(LA,IN.IN)
          + IO(LA,O.O2)
          T 1.4

English dictionary rule

C 21183   V N                    CONCENTRATION
          + CL(A)                P
          + ON(C)
          + TY(CN,IN)
          + FC(LA,OF,OF.WITH)/
          + TC(LA,IN,IN.IN)/
          + IO(LA,O,O.O2)
          T 1

C 20362   V N                    * ENRICHMENT
          + CL(A)                P
          + ON(V)
          + TY(CN)

German normal form rule

C 3643    V CONCENTRATION        C 20078
                                 $ IO
                                 $ 2.1
          A CAT(N)
          N TM(ANREICHERUNG)

C 3904    V ENRICHMENT           C 20078
          A CAT(N)
          N TM(ANREICHERUNG)

English normal form rule

C 1852    V CONCENTRATION        C 21183
                                 $ IO
                                 $ 2.1
          A CAT(N)

C 3903    V ENRICHMENT           C 20362
          A CAT(N)

Note that in the German and English dictionary rules above, the
objective genitive of nouns derived from transitive verbs is
indicated under the subscripts FC and TC.
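
A noun entry of this kind can be viewed as a stem plus a set of
subscript features, with the normal form rules linking one German
entry to several English ones. The Python sketch below is an
illustration only: the class, table, and function names are
hypothetical, only some of the subscripts are shown, and the use of ON
to choose between "a" and "an" is an assumption, since the report does
not state what the onset feature is used for.

```python
from dataclasses import dataclass, field


@dataclass
class NounEntry:
    rule_number: str
    stem: str
    subscripts: dict = field(default_factory=dict)


# German dictionary rule (GD is the gender subscript defined in the text).
GERMAN = {
    "C 20078": NounEntry("C 20078", "ANREICHERUNG",
                         {"CL": "20", "GD": "F", "TY": ["AB"]}),
}

# English dictionary rules for the two translation equivalents.
ENGLISH = {
    "C 21183": NounEntry("C 21183", "CONCENTRATION",
                         {"CL": "A", "ON": "C", "TY": ["CN", "IN"]}),
    "C 20362": NounEntry("C 20362", "ENRICHMENT",
                         {"CL": "A", "ON": "V", "TY": ["CN"]}),
}

# German normal form rules: dictionary rule -> equivalence classes.
GERMAN_NORMAL_FORM = {"C 20078": ["CONCENTRATION", "ENRICHMENT"]}

# English normal form rules: equivalence class -> dictionary rule.
ENGLISH_NORMAL_FORM = {"CONCENTRATION": "C 21183", "ENRICHMENT": "C 20362"}


def translation_equivalents(german_rule_number):
    """Return the English entries linked to one German dictionary rule."""
    classes = GERMAN_NORMAL_FORM[german_rule_number]
    return [ENGLISH[ENGLISH_NORMAL_FORM[name]] for name in classes]


def indefinite_article(entry):
    """Assumed use of ON: V (vowel onset) selects 'an', otherwise 'a'."""
    return "an" if entry.subscripts.get("ON") == "V" else "a"


for entry in translation_equivalents("C 20078"):
    print(indefinite_article(entry), entry.stem.lower())
```

Choosing between the two equivalents in context is a separate matter
that this sketch does not attempt.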



3.4.4 Adjective Entries

The following modifications of the adjective feature system
were introduced:

Additions:

CL = "paradigmatic class" (1-20 in German, A for English)

ON = "onset" (English only)

LC = "lexical collocation", with the value A (for German only).

Changes:

FO = "form of object" replaces the earlier OB

TM = "type of modificand" (semantic) (for values, cf. TS
     under verbs except for IT), and

FM = "form of modificand" (syntactic), with values NO for
     nominal, CL for clause; these last two subscripts
     replace the earlier subscript MO

SP = "special adjective", with the possible values PAPL and
     PRPL (past and present participle, respectively)
     replaces the earlier subscript FM.

Sample Rules for Adjectives 



German dictionary rule

C 10807   V A                    * GEBUNDEN
          + CL(11)
          + TM(MA,AB.IN+MS)/
          + FO(AN1,LA)/
          + TO(IN,LA)/
          + IO(O,LA)
          + SP(PAPL)
          T 1.3

German transfer rule

C 4745    V COMBINED             C 10807
          A CAT(A)
          N TM(GEBUNDEN)

C 4744    V DEPENDENT            C 10807
                                 $ IO
                                 $ 2.1
          A CAT(A)
          N TM(GEBUNDEN)

English dictionary rule

C 10807   V A                    * COMBINED
          + CL(A)                P
          + TM(AB,IN)
          + ON(C)

C 10405   V A                    * DEPENDENT
          + CL(A)                P
          + TM(AB,PO)
          + FO(LA,ON)
          + TO(LA,AB.PO)
          + IO(LA,O)
          + ON(C)
          T 1.3

English transfer rule

C U37     V COMBINED             C 10807
          A CAT(A)

355       V DEPENDENT            C 10405
                                 $ IO
                                 $ 2.1
          A CAT(A)



The cardinal numbers from 0 to 9 were coded with the features

TY = "type of number", with the values DG for "digit", SP
     for "spelled out"

ON = "onset" (English only)

V NU                    SIEBEN
+ TY(SP)

V NU                    * SIEBEN
+ TY(DG)

3.4.5 Revised Statistics of the Lexical Data Base for
Raketenantriebe: Ihre Entwicklung, Anwendung und Zukunft

German dictionary:        900 verb stems
                          165 lexical verb phrases (lexical
                              collocations)
                          860 adjective stems
                        3,180 noun stems

English dictionary:       950 verb stems
                          160 lexical verb phrases
                          850 adjective stems
                        3,200 noun stems

German normal form:     1,000 verbs
                          200 lexical verb phrases
                          860 adjectives
                        3,180 nouns

English normal form:    1,000 verbs
                          140 lexical verb phrases
                          850 adjectives
                        3,200 nouns

In addition to the dictionaries for the text Raketenantriebe:
Ihre Entwicklung, Anwendung und Zukunft, the compilation of major
lexical lists continued. The English-German adjective list now
contains approximately 27,300 English adjectives coded with
German translation equivalents and subject area or stylistic
descriptors. Of these, some 12,000 have been given syntactic and
semo-syntactic features.



CONCLUSION



The results of the efforts in systems construction and
linguistics performed under contract F30602-73-C-0192 strengthen
confidence in the soundness of the theoretical basis of LRS
and support the expectation expressed in the feasibility study
that quality machine translation can be obtained.

Future efforts will concentrate on the completion of LRS
and its application to an operational environment.